I. Introduction

A grammatical analysis of a language may be thought of as divided into
two parts, the syntax and the phonology. The lexicon might be assumed to
be a separate entity or a part of the syntax. In the phonology of a
language such as Mandarin, where the pitch pattern carries lexical
information, the phonology must have one section which contains the rules
for generating this pattern from information provided by the syntax and
the lexicon. The phonological rules have been classified into six major
types as follows:

(1) RS rules - those which add the retroflex suffix and effect a
change on the final of the syllable;

(2) SV rules - those which change vowels in certain positions into
glides;

(3) RD rules - those which reduplicate syllables;

(4) SA rules - those which assign stress;

(5) TS rules - those which provide for the correct tone sandhi;

(6) TP rules - those which place the tone on the proper portion of
the syllable.

This study is an attempt to discover something about the nature of the 
TS rules, and hopefully to formulate some of them. Assuming for the mo- 
ment that the lexicon is a part of the syntax, the block diagram shown 
in Figure 1 might be assumed to operate in the generation of a pitch 
pattern for Mandarin. 

For our purposes here, we will accept the traditional analysis of 
Mandarin having four distinctive tones, i.e. 1 - high level, 2 - high 
rising, 3 - low dipping, and 4 - high falling. "Sandhi" rules are those
which replace one tone by another, whereas the "allotonic" rules are 
those which specify the environmental influence on the actual pitch curve 
of a tone. Although the stress rules are located after the sandhi and 
allotonic rules, it is assumed that the stress positions have already 
been specified, so that the stress rules consist only of the physical 
specifications of the pitch curve which are necessary to produce a sen- 
tence which sounds stressed in the places previously specified. 

Figure 1. Block Diagram of Pitch Curve Generator



There are reasons for believing that some types of stress must be
specified prior to the generation of the sentence structure, or at least
during this generative procedure. As an example of the necessity of knowing
whether or not any part of a sentence is stressed, consider the following
examples, one in Mandarin and one in English:

"I want to go too."     "Wǒ yě yào qù." (I also/too want to go.)

In both of these cases the structure is permissible if stress is present, 
but is not permissible when none of the words is stressed. A further 
reason for desiring stress positions to be marked before the phonology 
operates will be illustrated in Section IV, where the operation of a 
sandhi rule is governed by the presence or absence of stress. Stress 
will therefore be assumed to be marked within the syntax component. The 
output of the syntax is assumed to be strings of morphemes with their 
structural descriptions, plus certain phonological information. 

During the initial stages of this investigation it will be assumed 
that the pitch pattern alone is the parameter which carries the informa- 
tion types indicated by Figure 1. Although this is certainly not true, 
it provides a simplification which brings the problem down to a manage- 
able size to begin with. Later in this study, tests will be made to 
determine the influence of other factors, primarily intensity, used by 
listeners to determine the tone of a word. 

As can be readily observed in the block diagram of Figure 1, the 
pitch pattern is carrying information of several types. Omitting the
effect of intrinsic pitch, since it carries no information, we have 
the single parameter of fundamental frequency (by assumption) carrying 
the following three loads: 

(1) lexical information;

(2) stress information;

Since this information is carried in a single parameter, we must expect
to have interaction among the different types of information carried, to
produce a sequence which has a unique or nearly unique interpretation to
a native speaker.

As yet, even the characteristic pitch curves of Mandarin words spoken
in isolation are not all known, or at least not all agreed upon. A good
deal of the difference of opinion is due, no doubt, to the fact that
nearly all the work along this line in the past was done, of necessity,
in an impressionistic manner. The tool that is probably most commonly
used now in studying the fundamental frequency of voice is the Sona-Graph.
Narrow band spectrograms give the researcher accurate, continuous pitch
contours from which to extract data.

In this study the Sona-Graph was used very little, for two reasons.
First, the collection of data by this method takes an exceedingly long
time. Second, the Sona-Graph does not lend itself readily to an automatic
data processing system which might be used in automatic recognition
of the spoken tone. The latter reason has considerable bearing on the
decision not to use a Sona-Graph, since it is intended that recognition
tests using the generative rules in reverse be used to aid in the
understanding of how these generative rules should be written. While the
reverse of the generating procedure can hardly undo conversions of one tone
into another due to tone sandhi operating, since there is a many-to-one
relationship between input and output, it is felt that development of
recognition procedures will provide a valuable first step to understanding
the generative system. Since the recognition procedures will incorporate
information gathered from real speech, even a highly accurate
recognition procedure can only take into account what has already been
observed. Therefore, upon formulation of an acceptable recognition
scheme, it will still be necessary to synthesize tone patterns and have
them judged by native speakers, in order to set reasonably accurate
limits to the permissible variations of pattern.

II. Procedure and Equipment for Data Gathering

A set of sixteen Mandarin words was selected, which had the following
properties:

(1) All words had the same vowel, /a/.

(2) The sixteen words could be broken down into four groups, each
group representing a single consonant-vowel sequence, and differing
only in that the four basic tones of the language were
present in each group.

As was mentioned by Chao,4 the gathering of data on a pitch curve is
complicated by the tendency of the informant to impose sentence intonation
on polysyllabic groups, and to shift key when reading sets of isolated
words. In this experiment the isolated words were merely spaced in time.
It might be better in the future to use a carrier sentence for the target
words, in spite of the fact that the tones preceding and following the
target word will affect the pitch contour. The choice is between accepting
the variations present in isolated words and accepting a uniform
deformation due to the tone's environment.

The original list of sixteen words was read twice by each of the two
informants,5 with a pause of several seconds between each word. The
informants then read two-tuple combinations of each set of segmentally
identical words in all possible combinations, and finally they read
three-tuple combinations of the segmentally identical sets in all possible
combinations. In addition to this, each informant read a set of
expressions which included "neutral tones," and a set of sentences in
which the sentence was first read without stress and then reread, stressing
a different word each time. All written lists were prepared in
characters.

The readings were made in a sound-proof recording booth, and recorded
on an Ampex tape recorder at 7-1/2 ips, using "Scotch 175" recording
tape and an Altec 633A dynamic microphone. The recordings were
then replayed on the Ampex recorder, and the electrical signal was fed
into a modified "Vocoder" pitch extractor circuit. The pitch signal
was then sent to one channel of a two-channel graphical recorder, the
other channel being fed the overall amplitude curve. The graphical
recorder was set at 125 mm/sec, and the frequency scale was calibrated at
5-minute intervals, using a signal generator and a frequency counter.

These graphical tapes were then edited by hand to mark word boundaries,
and to write in word and tone. For the words with unvoiced initials,
the determination of word boundaries was no problem, since the pitch
contour started at the onset of voicing. For the words with a voiced
initial the boundary was set by observing both the pitch contour and
the synchronous amplitude curve, since a voiced consonant has a lower
intensity than the vowel /a/.

Pitch and amplitude values were then read from the tape at 40-msec 
intervals during each word, and recorded on punched cards along with the 
information on tone and environment. These cards were then processed by 
an IBM 7090 computer, to convert the scale on the graphic tapes into a 
frequency scale for pitch, and to plot: pitch on a frequency versus time 
scale; amplitude on a relative scale versus time; and normalized frequen- 
cy on a ratio versus time scale. The normalized frequency was obtained 
by dividing the frequency at each point by the value of the frequency at 
the initial point of the most recent Tone 1. After this output was
studied to determine what parameters might be useful in specifying tones,
the cards were again run through a computer, to extract the following
information: 7

(1) The lexical tone assigned by the vowel, as well as that of the 
preceding and/or following words. 

(2) Average frequency of the vowel. 

(3) Frequency of a local minimum if one existed. 

(4) Slope toward the local minimum from beginning of the vowel. 

(5) Slope away from the local minimum to end of the vowel. 

(6) Initial and final frequencies, and the difference between them. 

(7) Concavity, whether up or down, and the degree of concavity. 

(8) Location of the minimum in number of 40-msec units from beginning 
of the vowel. 

(9) Number of 40-msec units in the vowel duration. 

(10) Average slope of the pitch curve from beginning to end of the
vowel. 

(11) The starting slope of the curve, averaged over the first two 
40-msec intervals. 

These data were then manipulated by hand to determine the effects that 
tone and tone environment had on the various parameters. 
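
The parameter list above can be made concrete with a short sketch. The
following Python is not the original IBM 7090 program; it is a minimal
illustration, assuming the definitions given in Appendix 4 and a list of
fundamental-frequency readings taken at 40-msec intervals over one vowel.
The function names and dictionary keys are hypothetical, the concavity
measure is omitted because its definition is not stated, the duration is
counted in intervals rather than points, and the "initial slope" of the
relative-slope definition is assumed to be the starting slope.

```python
def normalize(freqs, tone1_initial):
    """Divide each reading by the initial frequency of the most recent Tone 1."""
    return [f / tone1_initial for f in freqs]


def extract_parameters(freqs):
    """Compute pitch-curve parameters from readings spaced 40 msec apart."""
    if len(freqs) < 3:
        raise ValueError("need at least three readings")
    r = len(freqs) - 1  # R = number of data points - 1
    params = {
        "average_frequency": sum(freqs) / (r + 1),
        "initial_frequency": freqs[0],
        "final_frequency": freqs[r],
        "final_minus_initial": freqs[r] - freqs[0],
        "duration_units": r,  # assumed: number of 40-msec intervals
        "average_slope": (freqs[r] - freqs[0]) / r,
        # averaged over the first two 40-msec intervals
        "starting_slope": (freqs[2] - freqs[0]) / 2,
    }
    # assumption: "initial slope" means the starting slope
    params["relative_slope"] = params["average_slope"] - params["starting_slope"]
    # local minimum: first interior point lower than its predecessor and
    # lower than or equal to its successor
    for i in range(1, r):
        if freqs[i] < freqs[i - 1] and freqs[i] <= freqs[i + 1]:
            params["minimum_frequency"] = freqs[i]
            params["minimum_location"] = i
            params["slope_toward_minimum"] = (freqs[i] - freqs[0]) / i
            params["slope_from_minimum"] = (freqs[r] - freqs[i]) / (r - i)
            break
    return params
```

Slopes here are in frequency units per 40-msec interval. When no local
minimum exists, the four minimum-related keys are simply absent.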

III. Isolated Words 

In Figure 2 are shown some patterns of each of the four basic tones 
of Mandarin. Throughout this report, patterns represented by dotted 
points will indicate the female informant; patterns represented by dashed 
points will indicate the male informant. Both curves are normalized with
respect to the average frequency of the informant's Tone 1. The solid
lines represent the average of the curves on each graph, and approximate
the characteristic pattern of the tone. These curves were selected at
random from the isolated words spoken that did not have voiced initial
consonants. The words with voiced initial consonants were eliminated from
this graph because of the slightly different pitch pattern which is
produced. Figure 3 shows the pitch patterns of words having voiced initials,
with these voiced initials included in the pitch patterns. As can be noted
in Figure 3, the voicing during the consonant is approximately level, and
tends toward a middle position in the pitch range. It was assumed that 
the boundaries between consonants and vowels would be known by other me- 
thods, and in this study no attempt has been made to specify the pitch 
contour during the period of voicing for consonants. Naturally this must 
be done before one can completely specify a pitch curve. 

As can be readily seen in Figure 2, the pitch patterns for the four 
basic tones spoken in isolation are quite distinct. Differentiations can 
be made on the basis of the following observations: 

(1) The average frequency of Tone 1 did not intersect with the average
frequencies of any of the other tones. It is the highest of the
averages.

(2) The average frequency of Tone 3 did not intersect with the average
frequencies of any of the other tones. It is the lowest of the
averages.

Figure 2 (Part A). Words with Unvoiced Initial Consonants

Figure 2 (Part B). Words with Unvoiced Initial Consonants

Figure 3 (Part A). Words with Voiced Initial Consonants (ma)
(Horizontal axis: time in 40-msec intervals. Circled points indicate
voicing in the consonant.)

Figure 3 (Part B). Words with Voiced Initial Consonants (ma)

(3) Tones 2 and 4, although occupying the same area of average
frequency, i.e. the middle area below Tone 1 and above Tone 3, are
distinguished by the sign of the slope: that of Tone 2 is always
positive, and that of Tone 4 is always negative.

Due to the small sample from which the above observations were made, we can
hardly assume that the average frequencies of Tones 1 and 3 cannot
intersect with those of Tones 2 and 4. It therefore seems advisable to set up
an intermediate area at the upper and lower ends of the Tone 2, Tone 4
average frequency area, and make decisions on the basis of other parameters
in these ranges. Figure 4 presents a decision procedure which is designed
to resolve ambiguities which might be encountered. At present this
procedure has not been tested by a computer, but it will be in the near future.
Until such time as the recognition procedure shall be developed to an 
acceptable state of accuracy, no attempt will be made to formulate rules 
for the physical pitch curves of the isolated words. Appendix 4 gives 
average, maximum, and minimum values for the pertinent parameters of each 
tone spoken in isolation. 
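
The observations above suggest the general shape of such a decision
procedure. The sketch below is not the procedure of Figure 4, which is not
reproduced here; it is a hypothetical illustration in Python, assuming the
average frequency has been normalized so that the informant's Tone 1
average is 1.0. Every threshold value, the use of a flat slope in the
upper intermediate area, and the use of a local minimum in the lower
intermediate area are assumptions chosen only to show the structure.

```python
def classify_isolated_tone(avg_norm, avg_slope, has_minimum=False,
                           high=0.95, upper=0.88, lower=0.70, low=0.62,
                           flat=0.005):
    """Guess the tone (1-4) of an isolated word from pitch parameters.

    avg_norm    -- average frequency divided by the Tone 1 average
    avg_slope   -- average slope of the normalized curve
    has_minimum -- whether the curve has an interior local minimum
    All threshold defaults are hypothetical placeholders.
    """
    if avg_norm >= high:
        return 1  # clearly in the highest area
    if avg_norm <= low:
        return 3  # clearly in the lowest area
    if avg_norm >= upper:
        # upper intermediate area: a nearly level curve is taken as Tone 1
        if abs(avg_slope) <= flat:
            return 1
    elif avg_norm <= lower:
        # lower intermediate area: a dip is taken as Tone 3
        if has_minimum:
            return 3
    # middle area: the sign of the slope separates Tone 2 from Tone 4
    return 2 if avg_slope > 0 else 4
```

Real threshold values would have to come from the measured averages,
maxima, and minima for each tone.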



IV. Words in Sequence 

As soon as we start dealing with words in sequence we face the problems
of tone sandhi and allotonic change. It must be understood that the sandhi
and allotonic changes discussed below do not apply to the reduplicated
expressions in Mandarin. These expressions have an unusual behavior and must
be handled separately, probably in the syntax. There is also a small set
of words which have unusual tone behavior. These words are Bù, Yī, Chī,
and Bā (不, 一, 七, 八). The tones marked are the normal citation tones.

The most widely known sandhi change in Mandarin is the changing of a Tone
3 into a Tone 2 when it precedes a Tone 3, i.e. 33 → 23. Even this
change was open to debate, since some writers 8 claimed that the Tone 3
did not really become a Tone 2 but rather a new tone, called Tone 5. While
there is some evidence that the Tone 3 does not become a perfect Tone 2 in
this environment, Tone 3 is perceived as, and indistinguishable from, a
Tone 2 by native speakers, as shown by the experiment of K-P. Li. 9 The
only other tone sandhi commonly mentioned is that of Tone 2 becoming a
Tone 1 in certain environments. As can be deduced from the examples
presented by Chao in his Mandarin Primer, Tone 2 becomes a Tone 1 when it
follows either Tone 1 or Tone 2 and does not precede a pause. In addition
to some rules for neutral tones, with which we are not dealing at the
present time, Chao also mentions two allotonic changes: that of a Tone 4 not
falling as low as usual when it precedes another Tone 4, and that of a
Tone 3 not rising when it precedes another tone. Hence we could write
Chao's sandhi and allotonic changes as follows, with all rules repeatable.
The mark ( ) indicates that any of the tones inside may appear in this
position.

(1) 33 → 23

(2) (1, 2) 2 → (1, 2) 1, when the Tone 2 does not precede a pause

(3) 44 → 4'4, where 4' denotes a decrease in the amount of
fall of the fundamental frequency.

(4) 3 (1, 2, 4) → 3' (1, 2, 4), where 3' denotes a decrease in the
amount of rise of the fundamental frequency after the minimum.

In observing the patterns produced when various words of a sequence
were stressed, it became apparent that the sandhi rule which changes a
Tone 3 into a Tone 2 was quite independent of stress, i.e. the
combination 33 became 23 no matter where the stress was located, and when the
first Tone 3 was stressed it appeared identical to a stressed Tone 2. The
change of Tone 2 into a Tone 1 was not of this type. A Tone 2 became a
Tone 1 in the correct environment only when not stressed. When stressed,
the Tone 2, which had supposedly changed into a Tone 1, appeared identical
to a stressed Tone 2. This is one of the reasons mentioned in Section I
for desiring the stress to be marked before the phonological rules
function. The sandhi rule converting a Tone 2 into a Tone 1 is therefore
sensitive to stress, whereas the rule converting a Tone 3 into a Tone 2
is not. For the present we will assume that the allotonic rules are not
sensitive to stress. When stress has been investigated systematically,
we will be able to determine whether this assumption is useful, or whether
we should specify all or some allotonic changes as sensitive to stress.

Figure 4. Separation of Tones, Words Spoken in Isolation

Listed below are some sandhi and allotonic rules which have been
derived from the speech of the two informants. "." will be used to denote
a pause; "X" will be used to denote "no pause." The rules are ordered and
repeatable, and operate from left to right starting at a pause. In each
rule the number refers to the tone and all of its allotones. All
references to changes are to changes with respect to the tone, not with respect
to the allotonic state of the tone. Thus, rule #9 does not raise the
average slope another 30% above the level obtained by rule #8 functioning.

(1) 3 → 2 in environment (hereafter abbreviated env.) __ 3

(2) 2 → 1 in env. (1, 2) __ X

(3) (1, 2) → (1', 2') in env. (3, 4) __ , where 1' has its average
    frequency lowered ~3%, and 2' has its average frequency lowered ~5%
    and its average slope raised ~50%.

(4) 1 → 1' in env. __ 4

(5) 3 → 3' in env. __ X, where 3' has its average frequency raised ~15%
    and its average slope lowered ~150%.

(6) 4 → 4' in env. __ . , where 4' has its average slope lowered ~30%.

(7) 4 → 4'' in env. 3 __ . , where 4'' has its average slope lowered ~15%.

(8) 4 → 4''' in env. __ X, where 4''' has its average slope raised ~30%.

(9) 4 → 4iv in env. __ 1, where 4iv has its average frequency raised ~3%
    and its average slope raised ~30%.

(10) 4 → 4v in env. __ 4, where 4v has its average slope raised ~30%.

The amounts of change in the parameters as indicated by the allotonic
rules are of necessity rather vague, since there is considerable
variation permissible. The above values merely represent average effects
observed from the speech of the two informants.
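
The two sandhi rules and the stress restriction on the second can be
sketched in a few lines of Python. This is an illustration rather than the
authors' program. It assumes that the input is the tone sequence of one
stretch of speech between pauses, that each rule is applied in place from
left to right over the whole stretch before the next rule begins, and that
stress is supplied as a set of syllable positions. The function name and
the `stressed` parameter are hypothetical. The allotonic rules are not
modeled.

```python
def apply_sandhi(tones, stressed=frozenset()):
    """Apply sandhi rules (1) and (2) to one pause-delimited tone sequence.

    tones    -- list of lexical tones (1-4), first syllable after a pause first
    stressed -- set of 0-based positions that carry stress (hypothetical input)
    """
    out = list(tones)
    n = len(out)
    # Rule (1): 3 -> 2 before another Tone 3; not sensitive to stress.
    for i in range(n - 1):
        if out[i] == 3 and out[i + 1] == 3:
            out[i] = 2
    # Rule (2): 2 -> 1 after Tone 1 or Tone 2 when not before a pause,
    # and only when the syllable is unstressed.
    for i in range(1, n - 1):
        if out[i] == 2 and out[i - 1] in (1, 2) and i not in stressed:
            out[i] = 1
    return out
```

For example, `apply_sandhi([3, 3, 3])` gives `[2, 1, 3]`, while
`apply_sandhi([3, 3, 3], stressed={1})` gives `[2, 2, 3]`, because the
stressed middle syllable is exempt from rule (2).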

A rather crude attempt has been made to code the tones and the rules
into a binary type coding slightly similar to that proposed by Helen 
Wong. 11 The main difference between this coding and that proposed by 
Wong, or that of distinctive features applied to segmentals, is that 
the plus-minus indicators are not necessarily diametrically opposed. 
Specifically, the features TH1, TH2, etc. (the TH stands for threshold) 
are not in diametric opposition but are merely a change of coding. Some 
simplification of the rules is made possible by using this type of rules, 
which might be construed to indicate that this approach will be fruitful 
in the future. 

We begin by defining four tones, 1, 2, 3, and 4, which are specified
by giving two features, high-low and dynamic-static. From these four
tones we generate the 12 allotones indicated in the above rules, by
adding change features in a binary coding. Figure 5 gives the entries
which will be used in the following manipulations to specify the tones
and their allotones as shown.

It has been demonstrated by Halle 12 and Bever 13 that rules may be
simplified by introducing variables in the notation. In the rules shown
below the variable "α" will be used to indicate whether the place which
it occupies should be the same sign or different from that of a
previously used "α" in the same rule, e.g. [α high] → [-α high] indicates that
different signs should occupy the two positions, whereas [α high] → [α
dynamic] indicates that the same sign (+ or -) should occupy both
positions. Once again, "." will be used to denote a pause, and "X" will be
used to indicate "no pause." The rules are ordered and repeatable, that
is, rule number n should be consecutively applied as long as its
conditions are met, before going on to rule number (n + 1). The rules apply
from left to right on a morpheme string, starting at a pause.
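
The effect of the variable notation can be illustrated with a small
sketch. It assumes, purely for illustration, that a tone is a dictionary
mapping feature names to signs (True for +, False for -). The function
name and its parameters are hypothetical and are not part of the coding
described here.

```python
def apply_variable_rule(features, source, target, same_sign=True):
    """Set `target` from the sign of `source`, as in a rule with a variable.

    same_sign=True  models [a source] -> [a target]
    same_sign=False models [a source] -> [-a target]
    """
    alpha = features[source]  # the sign bound to the variable
    result = dict(features)   # leave the input unchanged
    result[target] = alpha if same_sign else not alpha
    return result


tone = {"high": True, "dynamic": False}
# Same sign copied from "high" onto "dynamic".
copied = apply_variable_rule(tone, "high", "dynamic", same_sign=True)
# Opposite sign written back onto "high".
flipped = apply_variable_rule(tone, "high", "high", same_sign=False)
```

A single rule written with the variable thus stands for two rules written
with explicit signs, which is where the saving in binary bits comes from.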

Rule (1) is a sandhi rule which changes Tone 3 into Tone 2 when it
precedes another Tone 3. Rule (2) is a sandhi rule which changes Tone 2
into a Tone 1 when it follows either Tone 1 or Tone 2 and does not
precede a pause. Rule (3) is an allotonic rule which modifies the average
frequency of Tones 1 and 2 and raises the slope of Tone 2 when either
tone follows a Tone 3 or a Tone 4. Rule (4) is an allotonic rule which
modifies the average slope of Tone 4 and changes the average frequency
and the average slope of Tone 3 when either tone does not precede a pause.
Rule (5) is an allotonic rule which modifies the average frequency and
average slope of Tone 4 by amounts which depend on whether the Tone 4
precedes a Tone 1 or another Tone 4. Rule (6) is an allotonic rule which
modifies Tone 1 in the same manner when it precedes a Tone 4 as it was
modified when following either Tone 3 or Tone 4. Rule (7) is an
allotonic rule which changes the average slope of a Tone 4 when it precedes
a pause. Rule (8) is an allotonic rule which further changes the slope
of a Tone 4 if it follows Tone 3 when preceding a pause. A correlation
between the two sets of rules, in terms of which rule is replaced by which
rule, and how many binary bits are saved in the process, is given below.
The first rule number mentioned refers to the binary rules, and the
second number refers to the previous set of rules.

Rule (1) replaces Rule (1); there is no saving of information.

Rule (2) replaces Rule (2); there is a saving of 4 bits in the binary
rule.

Rule (3) replaces Rule (3); there is a saving of 9 bits in the binary
rule.

Rule (4) replaces Rules (5) and (8); there is a saving of 8 bits in
the binary rule.

Rule (5) replaces Rules (9) and (10); there is a saving of 12 bits in
the binary rule.

Rule (6) replaces Rule (4); there is a saving of 2 bits in the binary
rule.

Rule (7) replaces Rule (6); there is a saving of 4 bits in the binary
rule.

Rule (8) replaces Rule (7); there is a saving of 4 bits in the binary
rule.



Thus, there is a saving of 44 bits when the binary rules are used in 
place of the previous rules. 

As is immediately apparent from Figure 5, the coding of the allotones 
is quite arbitrary with relation to the binary bits, TH1, TH2, and TH3. 
This arbitrariness indicates that the subtle relations between tones have 
not yet been discovered. Until such time as one can specify contrastive 
binary bits which have physical meaning in their own right, and also faci- 
litate the writing of the rules, it must be assumed that one is working 
with rather artificial codings, which will not reveal the common features 
of tone languages, or show how these features interact in producing the 
effects of sandhi and allotonic change due to environment.

What is lacking, then, is a distinctive features type framework with 
which to describe the fundamental frequency parameter of a tone language. 

The sandhi change of a Tone 3 into a Tone 2 is evidenced by a change
in the average slope of the Tone 3 in the correct environment to that of
a similarly situated Tone 2. The average slope after this change is
quite different from the average slope of a similarly situated natural
Tone 2 for one of the informants, but quite similar for the other. If
the differences were relatively constant, then there would be a chance
to specify the original tones in a recognition procedure with one less
ambiguity.

The other sandhi change, that of a Tone 2 into a Tone 1 in certain
environments, is apparently quite complete for one of the informants.
The other informant, however, has the average frequency of a derived
Tone 1 quite far below that of a natural Tone 1. This informant also
retains a good deal of the upward slope of the Tone 2 when it becomes
a Tone 1, so that one might seriously question whether this change is
actually to be included under the classification of sandhi, or whether
it should be referred to as merely an allotonic change in which the
allotone closely resembles a Tone 1 in the same environment. The former
course was chosen because it is the one specified by Chao; the evidence
was not strong enough in this brief sample to challenge his decision.

The neutral tone has been omitted from consideration at present,
although a small amount of data on it has been collected. Upon examination
of this data, it appeared that the introduction of the neutral tone also
introduced some stress in the surrounding tones, as evidenced by a marked
rise in the average frequencies of preceding and following tones. While
this effect may have been due purely to an unwise choice of phrases which
contained the neutral tone, or to an inadvertent change in the speech of
the informants due to reading phrases which were not nonsensical, it seems
reasonable that we should expect stress to occur when "non-stress" is
introduced. 14 It therefore appears likely that the study of the neutral
tone will have to be concurrent with the study of stress. An additional
problem in dealing with a neutral tone will be that of setting the
criteria for "neutrality," i.e. when is a tone no longer in possession of
characteristics which might differentiate it from other tones?

Pike 15 has discussed this problem of tone neutralization and has set
a criterion for determining when a tone is neutral. This criterion might
be simply stated by saying that a tone becomes neutral when one cannot
distinguish it from all of the other tones in the language. Or, to state
it another way, a given tone is a neutral tone when it is impossible to
identify exactly which of the basic tones of the language it is supposed
to be. The difficulty in applying this criterion is that it assumes
knowledge of exactly what variations are permissible for a given tone in a
given environment, i.e. the tone is not neutral just because one cannot
distinguish which tone it is, unless the recognition system used is
perfect. This approach, however, is not to be ignored simply because one
needs to assume a perfect analysis system. One can just as well use an
empirical approach, by setting up the areas of neutrality to the best of
one's knowledge, and then working to shrink these areas of ambiguity by
improving the recognition scheme.

Another factor worthy of comment is that such a criterion for
neutrality will introduce a whole host of neutral tones in place of the one
neutral tone which is assumed in the classical analyses of Mandarin. That
is, we can have neutral tones which are ambiguous as to which of two tones
the tone in question is supposed to be, or the ambiguity can be among three
tones, etc. If we define the order of neutrality to be #1 when the
ambiguity is between two tones, and to be #2 when the ambiguity is among three
tones, and so forth, then we have a set of neutral tones in each of the
orders possible. For a tone language with n contrastive tones we can write
an expression for the maximum number of neutral tones which can occur in
each order as follows:

$$(\text{ORDER \#1})_{\max} = \frac{n(n-1)}{2!}$$

$$(\text{ORDER \#2})_{\max} = \frac{n(n-1)(n-2)}{3!}$$

$$(\text{ORDER \#}D)_{\max} = \frac{n(n-1)(n-2)\cdots(n-D)}{(D+1)!}$$

Thus, the maximum number of neutral tones in the language is given by:

$$(\text{\# of neutral tones})_{\max} = C^{n}_{2} + C^{n}_{3} + \cdots + C^{n}_{n}
= \frac{n(n-1)}{2!} + \frac{n(n-1)(n-2)}{3!} + \cdots + \frac{n(n-1)(n-2)\cdots(n-n+1)}{n!}$$


While it is improbable that there will be the maximum number of
contrastive neutral tones of a given order except for the order #(n-1), this
type of approach is still quite different from the classical approach, in
which all tones which become neutralized become completely neutralized,
i.e. of the order #(n-1).
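
The maximum for each order is a binomial coefficient: a neutral tone of
order D is ambiguous among D + 1 of the n tones, so there are at most
C(n, D+1) of them. The short Python sketch below evaluates the expressions
for a given n. The function names are illustrative only.

```python
from math import comb


def max_neutral_tones(n, order):
    """Maximum number of neutral tones of the given order for n tones."""
    if not 1 <= order <= n - 1:
        raise ValueError("order must lie between 1 and n - 1")
    return comb(n, order + 1)  # choose the order + 1 tones left ambiguous


def total_max_neutral_tones(n):
    """Sum of the maxima over all orders 1 .. n - 1."""
    return sum(max_neutral_tones(n, d) for d in range(1, n))
```

For the four tones of Mandarin this gives maxima of 6, 4, and 1 for
orders #1, #2, and #3, or 11 in all. The total equals 2^n - n - 1.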



From the standpoint of a generative system, these various orders of 
neutrality need not concern us, since we merely specify the possible
variations for each tone as a function of its environment, and do not worry
about whether the variations of other tones intersect with
these specifications. (In the classical treatment of Mandarin, however,
the generation of the neutral tone is not determined by tone and environ- 
ment, but rather by lexical information and environment.) From the recog- 
nition standpoint, however, we must be conscious of the fact that we can 
have several levels of ambiguity, and several ambiguities at each level. 

Appendix 4 gives the values of some of the extracted parameters as a
function of environment, for the reader's information.



V. Summary 

The form of a generative system for fundamental voice frequency in 
Mandarin is discussed, and several assumptions are made in order to reduce 
the problem to a manageable size. Data was gathered from two speakers, 
by having them read a prepared list which contained isolated words, two- 
tuples, and three-tuples, in all possible combinations of the four basic 
tones. The data gathering system utilized a modified "Vocoder" pitch
extractor, a two-channel graphical recorder, and an electronic digital 
computer, which was used to plot out the pitch curve, and also to ex- 
tract various parameters of the pitch contour. Examples were given of 
the four basic tones of Mandarin, and, on the basis of the data gathered, 
a set of rules was proposed which would account for the tone sandhi and 
certain allotonic changes evident in Mandarin speech. 

Monosyllables

Two-tuples

Three-tuples

Appendix 1

D. Neutral Tones

Appendix 2

Linguistic Backgrounds of the Two Informants
and
List of Test Materials Read

Informant                         #1                      #2
--------------------------------  ----------------------  --------------------------
Name                              Kung-Pu Li              Lillian Liu
Sex                               Male                    Female
Age                               29                      26
Dialect                           Mandarin                Mandarin
Place of Birth                    Shanghai                Shanghai
Elementary School                 Peking                  Shanghai, Nanking
High School                       Peking, Taiwan          United States
College                           Taiwan, United States   United States
Dialect spoken by parents         Fukianese, Mandarin     Shanghai, Kiangsi Mandarin
Dialect spoken at home            Mandarin                Shanghai, Kiangsi Mandarin
Other dialects                    -                       Cantonese
Proficiency in other languages    English, fair           English, fluent;
                                                          Spanish, fluent



[Figure: Block Diagram Analysis Scheme (labels include PITCH EXTRACTOR)]
Appendix 4

Definition of Parameters Extracted from Pitch Curves

R = Number of data points - 1.

$$\text{Average Frequency} = \frac{1}{R+1} \sum_{n=0}^{R} \text{Frequency}_n$$

Frequency of Minimum = Frequency of that point whose frequency is lower than
the point preceding it and lower than or equal to the frequency of the
following point.

$$\text{Slope toward minimum} = \frac{\text{Minimum frequency} - \text{Initial frequency}}{\text{Number of 40-msec intervals between initial and minimum}}$$

$$\text{Slope from minimum} = \frac{\text{Final frequency} - \text{Minimum frequency}}{\text{Number of 40-msec intervals between minimum and final point}}$$

$$\text{Starting slope} = \text{Frequency of 2nd data point} - \text{Frequency of initial point}$$

$$\text{Average slope} = \frac{\text{Frequency of final point} - \text{Frequency of initial point}}{\text{Number of 40-msec intervals between initial and final points}}$$

Relative slope = Average slope - initial slope.



Concavity was called upward if the relative slope was positive, and called 
downward if the relative slope was negative. 

The magnitude of the relative slope serves as a measure of concavity. 
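
The parameter definitions above can be illustrated with a minimal sketch. It is not the program used in the study. It assumes the pitch curve is given as a list of frequencies sampled every 40 msec, that the "initial slope" in the relative-slope definition is the starting slope, that the first point satisfying the minimum criterion is the one reported, and that a relative slope of exactly zero is labelled "none". The names `PitchParameters` and `extract_parameters` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class PitchParameters:
    average_frequency: float
    minimum_frequency: Optional[float]      # None when no minimum is evident
    slope_toward_minimum: Optional[float]   # per 40-msec interval
    slope_from_minimum: Optional[float]     # per 40-msec interval
    starting_slope: float
    average_slope: float
    relative_slope: float
    concavity: str                          # "upward", "downward", or "none"


def extract_parameters(freqs: Sequence[float]) -> PitchParameters:
    """Compute the appendix parameters from points spaced 40 msec apart."""
    if len(freqs) < 2:
        raise ValueError("need at least two data points")
    r = len(freqs) - 1                      # R = number of data points - 1
    average_frequency = sum(freqs) / (r + 1)

    # Minimum: lower than the preceding point and lower than or equal to
    # the following point. Assumption: take the first such interior point.
    min_index = None
    for n in range(1, r):
        if freqs[n] < freqs[n - 1] and freqs[n] <= freqs[n + 1]:
            min_index = n
            break

    if min_index is None:
        minimum_frequency = slope_toward = slope_from = None
    else:
        minimum_frequency = freqs[min_index]
        slope_toward = (minimum_frequency - freqs[0]) / min_index
        slope_from = (freqs[r] - minimum_frequency) / (r - min_index)

    starting_slope = freqs[1] - freqs[0]    # over one 40-msec interval
    average_slope = (freqs[r] - freqs[0]) / r
    # Assumption: "initial slope" means the starting slope defined above.
    relative_slope = average_slope - starting_slope

    if relative_slope > 0:
        concavity = "upward"
    elif relative_slope < 0:
        concavity = "downward"
    else:
        concavity = "none"                  # assumption: zero is unclassified

    return PitchParameters(average_frequency, minimum_frequency,
                           slope_toward, slope_from, starting_slope,
                           average_slope, relative_slope, concavity)
```

For a dipping contour such as `[100, 90, 80, 90, 110]`, the sketch finds the minimum at the third point and classifies the concavity as upward. A steadily rising contour yields no minimum, which corresponds to the "no minimums are evident" entries in the data tables.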
Appendix 5

Acoustical Data Extracted

A. Data for Isolated Tones

Informant # 1
                            average    maximum    minimum
Tone 1:
    average frequency        136.5      145.4      129.2
    average slope             1.52       3.34       0.56
Tone 2:
    average frequency        113.0      117.2      109.0
    average slope             4.96       6.68       3.34
    (no minimums are evident)
Tone 3:
    average frequency         90.8       95.5       87.6
    average slope             1.36       3.58      -0.37
    average minimum           80.4       85.0       71.7
Tone 4:
    average frequency        117.0      125.1      109.2
    average slope             10.6        7.4      13.36
Informant # 2
                            average    maximum    minimum
Tone 1:
    average frequency        231.9      242.9      220.6
    average slope             1.09       4.38      -1.67
Tone 2:
    average frequency        195.0      202.1      190.5
    average slope             3.83       6.88       0.42
    average minimum          187.1      195.0      185.0
Tone 3:
    average frequency        151.8      159.1      145.8
    average slope            -5.65      -3.00     -14.17
    average minimum          142.0      152.5      135.0
Tone 4:
    average frequency        198.2      218.5      182.5
    average slope            -17.1      -11.7      -23.8
Appendix 5

Acoustical Data Extracted

B. Data for Tones in Sequence

                          Informant # 1                  Informant # 2
                    average  maximum  minimum      average  maximum  minimum
Tone 1:
env. 1_    AV         130      136      124          246      269      234
           AVSL      -.33      .42    -2.09         -.84      1.5    -8.13
env. 2_    AV         127      131      122          246      263      233
           AVSL       .50     2.23    -1.67         -.45      5.0    -4.07
env. 3_    AV         123      135      117          233      244      211
           AVSL   [illeg.] [illeg.]   -1.25        -1.21      7.5    -31.7
env. 4_    AV         122      147      104          231      256      194
           AVSL       .86     3.34    -2.09          .75     6.25    -2.50
env. _1    AV         128      135      122          241      265      226
           AVSL      1.24     5.01     -.84         2.59      6.5        0
env. _2    AV         125      133      120          240      257      226
           AVSL      1.11     5.85     -2.0          .92      8.1     -7.5
env. _3    AV         125      137      121          244      259      232
           AVSL      1.59     4.17      .33         3.14      8.3     -2.4
env. _4    AV         122  [illeg.]     114          237      253      224
           AVSL      2.38     5.84     1.25         1.72      8.1     -3.2

Tone 2:
env. 1_    AV         111      120      105          191      203      181
           AVSL      2.57     5.01     -.42          .64      7.5    -4.38
           MIN         99      101       95          180      190      172
env. 2_    AV         111      118      104          189      205      176
           AVSL      2.24     4.45     -.84         2.16      7.5     -2.9
           MIN        103      112       98          178      190      165
env. 3_    AV         105      111       99          177      190      165
           AVSL      3.96     7.24    -1.34         7.43     15.0      3.7
           MIN         95      102       88          166      178      160
env. 4_    AV         104      111       92          182      190      170
           AVSL      2.60     5.43        0         6.74      9.0      1.7
           MIN         97      105       90          173      180      167
                          Informant # 1                  Informant # 2
                    average  maximum  minimum      average  maximum  minimum

env.       AV         118      122      114          216      228      207
           AVSL       .94     8.35    -2.92         1.58      5.0     -5.4
           MIN        116      118      111          207      210      205
env.       AV         123      125      121          214      223      205
           AVSL       .21     3.34    -2.92         4.03     10.0     -1.7
           MIN        121      123      120          206      210      203
env.       AV         117      126       90          222      237      212
           AVSL      3.04    12.11    -1.34         6.05     10.8     -0.4
           MIN        104      125       68          212      222      205
env.       AV         120      128      120          207      231      191
           AVSL       .58     2.34        0         2.75      6.7     -2.5
           MIN        103      105      105          203      215      192
env.       AV         111      115      105          206      223      191
           AVSL      4.26     7.51     1.67         11.7     14.5      6.5
           MIN        108      112      105          195      195      195
env.       AV         115      118      110          199      218      184
           AVSL      4.57     7.51     2.92         9.32     14.6      3.8
           MIN          -        -        -          207      207      207
env.       AV         113      117      110          214      235  [illeg.]
           AVSL      4.52     7.24     1.95         13.2     20.3      5.4
           MIN        110      110      110            -        -        -
env.       AV         112      119      106          190      201      181
           AVSL      5.56     8.35     1.25         9.36     13.3      4.1
           MIN        110      110      110          188      188      188

Tone 3:
env. 1_    AV      [illeg.]     98       77          162      174      152
           AVSL     -1.53     1.95    -6.12        -6.72        0    -12.8
           MIN         75       87       68          150      164      142
env. 2_    AV          85      104       76          162      176      151
           AVSL     -1.21      1.0    -4.34        -6.73       .4    -12.1
           MIN         79  [illeg.]      70          149      155      142
env. 3_    AV          88       94       78          159      183      132
           AVSL     -1.97    -0.28    -5.01        -5.98        0    -12.9
           MIN         78       87       70          148      159      125
env. 4_    AV          82       87       72          158      165      146
           AVSL     -0.66     2.39    -3.34        -1.41      5.8     -6.7
           MIN         72       78       68          150      155      142


                          Informant # 1                  Informant # 2
                    average  maximum  minimum      average  maximum  minimum

env. _1    AV          90      105       75          179      214      160
           AVSL     -4.83      .28    -9.18        -6.62     -1.9    -12.2
           MIN         80       93       73          158      170      142
env. _2    AV          92      102       85          179      199      161
           AVSL     -5.97    -0.67   -13.36         -9.7     -2.5    -17.1
           MIN         83       90       78          163      170      155
env. _3    AV         118      124      112          216      230      188
           AVSL      2.54     8.35    -1.39          5.1     11.7     -8.8
           MIN        116      123      107          212      215      210
env. _4    AV          90      100       78          170      195      153
           AVSL     -4.42    10.02   -15.03        -5.38     -0.7     -9.8
           MIN         83       92       72          157      165      147

Tone 4:
env. 1_    AV         115      122      102          217      233      195
           AVSL     -11.6    -5.34   -19.48       -27.25    -19.2    -37.4
env. 2_    AV         114  [illeg.] [illeg.]     [illeg.]     239      183
           AVSL    -13.55    -6.68   -17.81        -25.2  [illeg.]   -31.7
env. 3_    AV         112      122      105          215      232      206
           AVSL     -10.2    -2.78   -18.37        -23.4    -14.6    -32.5
env. 4_    AV         114      125       97          219      234      207
           AVSL     -11.9    -4.68   -17.53        -20.6     -9.4    -35.4
env. _1    AV         119      125      114          234      255      215
           AVSL     -6.15    -1.34   -11.69        -9.69     -4.2    -17.5
env. _2    AV         115      121      110          221      237      203
           AVSL     -5.94        0    -8.35        -13.8     -5.6    -26.9
env. _3    AV         115      125      108          225      243      212
           AVSL     -6.95    -3.67   -10.58       -11.78      .36    -22.5
env. _4    AV         118      126      110          223      238      207
           AVSL     -3.30     1.67    -7.93        -9.88     -2.1    -17.9










